Utilizing Goroutines for Concurrency
Understand goroutines in Go and how they are used for concurrency and synchronization.
In the modern era of computers, concurrency is the name of the game. In the years before 2005 or so, the speed of a single central processing unit (CPU) roughly doubled every 18 months, a trend popularly attributed to Moore's law. Consumer systems with multiple CPUs were rare, and each CPU had a single core, so software that used multiple cores efficiently was also rare. Over time, it became more expensive to increase single-core speed, and multi-core CPUs became the norm.
Each core on a CPU supports a number of hardware threads, and operating systems (OSs) provide OS threads that are mapped to hardware threads that are then shared between processes.
Languages can utilize these OS threads to run functions in their language concurrently instead of serially, as we have been doing in all of our code so far.
Starting an OS thread is an expensive operation, and fully utilizing the thread's time requires paying a lot of attention to what we are doing.
Go goes further than most languages here with goroutines. The Go runtime includes a scheduler that maps goroutines onto OS threads and switches which goroutine is running on which thread to optimize CPU utilization.
This produces concurrency that is easy and cheap to use and places less mental burden on the developer.
Starting a goroutine#
Go gets its name from the go keyword that is used to spawn a goroutine. By applying go before a function call, we can cause that function to execute concurrently with the rest of the code. Here is an example that causes 10 goroutines to be created, with each printing out a number:
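A minimal sketch of such a program, assuming a loop variable named x and a final select {} that blocks main() so the goroutines get a chance to run. The helper name printNumbers is hypothetical and only keeps the loop in one place:

```go
package main

import "fmt"

// printNumbers is a hypothetical helper that starts n goroutines.
// Each goroutine prints one number. The function returns right away
// without waiting for any of them to run.
func printNumbers(n int) {
	for x := 0; x < n; x++ {
		go fmt.Println(x) // runs concurrently with the rest of the program
	}
}

func main() {
	printNumbers(10)
	fmt.Println("hello")

	// Block forever so main() does not return before the goroutines run.
	// Once every other goroutine has finished, nothing can wake main()
	// up again, and the runtime stops the program with a deadlock error.
	select {}
}
```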
Note: We'll also notice that this program stops with a fatal error (all goroutines are asleep - deadlock!) after running. Once the spawned goroutines have finished, main() is the only goroutine left, and it is blocked forever, so nothing in the program can ever run again. Go's deadlock detector notices this and kills the program. We'll handle this more gracefully later.
Running this will print out the numbers in random order. Why random? Once we are running concurrently, we cannot be sure when a scheduled function will execute. At any given moment, there will be between 0 and 10 goroutines executing fmt.Println(x), and another one executing fmt.Println("hello"). That's right—the main() function is its own goroutine.
Once the for loop ends, fmt.Println("hello") will execute. hello might be printed out before any of the numbers, somewhere in the middle, or after all the numbers. This is because they are all executing at the same time like horses on a racetrack. We know all the horses will reach the end, but we don't know which one will be first.
Synchronization#
When doing concurrent programming, there is a simple rule: any number of goroutines can read a variable concurrently without synchronization, but as soon as even one goroutine writes to it, every access to that variable must be synchronized. These are the most common methods of synchronization in Go:
The channel data type to exchange data between goroutines.
Mutex and RWMutex from the sync package to lock data access.
WaitGroup from the sync package to track when a set of goroutines has finished.
These can be used to prevent multiple goroutines from reading and writing to variables at the same time.
It is undefined what happens if we try to read and write to the same variable from multiple goroutines simultaneously (in other words, that is a bad idea).
Reading and writing to the same variable concurrently without synchronization is called a data race. Go has a data race detector, enabled with the -race flag and not covered in this course, to uncover these types of problems.
WaitGroups#
A WaitGroup is a synchronization counter that starts at 0 and can never go negative. It is most often used to indicate when some set of tasks is finished before executing code that relies on those tasks.
A WaitGroup has a few methods, as outlined here:
.Add(int): Used to add some number to the WaitGroup
.Done(): Subtract 1 from the WaitGroup
.Wait(): Block until WaitGroup is 0
In our previous section on goroutines, we had an example that crashed with a deadlock error after running. This happened because every goroutine, including main(), had either finished or was blocked. We used a select statement to block forever so that the program did not exit before the goroutines could run. Instead, we can use a WaitGroup to wait for our goroutines to end and then exit gracefully.
Let's do it again, as follows:
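A minimal sketch of that program, assuming a WaitGroup variable named wg. The printAll helper and the final message are hypothetical:

```go
package main

import (
	"fmt"
	"sync"
)

// printAll is a hypothetical helper that starts n goroutines, each printing
// one number, and returns only after every one of them has finished.
func printAll(n int) {
	wg := sync.WaitGroup{}
	for x := 0; x < n; x++ {
		wg.Add(1) // count the goroutine before starting it
		go func(x int) {
			defer wg.Done() // subtract 1 when this goroutine exits
			fmt.Println(x)
		}(x)
	}
	wg.Wait() // block until the counter is back at 0
}

func main() {
	printAll(10)
	fmt.Println("All work done")
}
```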
This example uses a WaitGroup to track the number of goroutines that are outstanding. We add 1 to wg before we launch our goroutine (do not add it inside the goroutine, because Wait() could then run before the counter has been incremented). When the goroutine exits, the deferred wg.Done() call runs, which subtracts 1 from the counter.
Note: A WaitGroup counter can never be negative. If we call .Done() when the WaitGroup is at 0, it will cause a panic. Because of the way they are used, the creators knew that any attempt to reach a negative value would be a critical bug that needs to be caught early.
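This behavior can be observed in a small sketch that calls Done() on a fresh WaitGroup and recovers from the panic. The doneTooOften function is a hypothetical name, and real code should fix the mismatched Add()/Done() calls rather than recover from them:

```go
package main

import (
	"fmt"
	"sync"
)

// doneTooOften calls Done() on a WaitGroup whose counter is already 0 and
// returns whatever value was recovered from the resulting panic.
func doneTooOften() (recovered interface{}) {
	defer func() {
		recovered = recover()
	}()
	var wg sync.WaitGroup
	wg.Done() // the counter would drop below 0, so this panics
	return nil
}

func main() {
	fmt.Println("recovered:", doneTooOften())
}
```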
wg.Wait() waits for all the goroutines to finish, and calling defer wg.Done() causes our counter to decrement until it reaches 0. At that point, Wait() stops blocking and the program exits the main() function.
Note: If passing a WaitGroup in a function or method call, we need to use a wg := &sync.WaitGroup{} pointer. Otherwise, each function is operating on a copy, not the same value. If used in a struct, either the struct or the field holding the WaitGroup must be a pointer.
Channels#
Channels provide a synchronization primitive in which data is inserted into a channel by a goroutine and removed by another goroutine. A channel can be buffered, meaning it can hold a certain amount of data before blocking, or unbuffered, where a sender and receiver must both be present for the data to transfer between goroutines.
A common analogy for a channel is a pipe in which water flows. Water is inserted into a pipe and flows out the far side. The amount of water that can be held in the pipe is the buffer. Here, we can see a representation of goroutine communication using a channel:
Channels are used to pass data from one goroutine to another, where the goroutine that passed the data stops using it. This allows us to pass control from one goroutine to another, giving access to a single goroutine at a time. This provides synchronization.
Channels are typed, so only data of that type can go into the channel. Because channels, like maps and slices, are reference-like types whose zero value is nil, we use make() to create them, as follows:
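A minimal sketch, with the cap() and len() calls added only to show the state of the buffer:

```go
package main

import "fmt"

func main() {
	ch := make(chan string, 1) // channel of strings with a buffer of 1

	fmt.Println(cap(ch)) // prints 1: the size of the buffer
	fmt.Println(len(ch)) // prints 0: nothing has been sent yet
}
```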
The code above creates a channel called ch that holds a string type with a buffer of 1. Leaving ", 1" off will make it an unbuffered channel.
Sending/receiving#
Sending to a channel is done with the <- syntax. To send a string type to the preceding channel, we could do the following: ch <- "word". This attempts to put the "word" string into the ch channel. If the channel has space available in its buffer, execution continues in this goroutine. If the buffer is full, the send blocks until buffer space becomes available. On an unbuffered channel, the send blocks until another goroutine receives from the channel.
Receiving is similar using the same syntax but on the opposite side of the channel. The goroutine trying to pull from the channel would do this: str := <-ch. This assigns the next value on the channel to the str variable.
More commonly when receiving values, the for range syntax is used. This allows us to pull all values out of a channel. An example using our preceding channel might look like this:
Channels can be closed so that no more data will be sent to them. This is done with the built-in close() function. To close the preceding channel, we could do close(ch). This should always be done by the sender. Closing a channel will cause a for range loop to exit once all values on the channel have been removed.
Let's use a channel to send words from one goroutine to another, as follows:
Note: After a channel is closed, sending a value on it will cause a panic. Receiving from a closed channel first returns any values still in the buffer and then returns the zero value of the type the channel holds. A channel can be nil. Sending to or receiving from a nil channel blocks forever. It is a common bug for developers to forget to initialize channels in a struct.
In this lesson, we have gained basic skills in using goroutines for concurrent operations and learned what synchronization is and when we must use it.